Feature Selection Ensemble

نویسندگان

  • Qiang Shen
  • Ren Diao
  • Pan Su
چکیده

Many strategies have been exploited for the task of feature selection, in an effort to identify more compact and better quality feature subsets. Such techniques typically involve the use of an individual feature significance evaluation, or a measurement of feature subset consistency, that work together with a search algorithm in order to determine a quality subset. Feature selection ensemble aims to combine the outputs of multiple feature selectors, thereby producing a more robust result for the subsequent classifier learning tasks. In this paper, three novel implementations of the feature selection ensemble concept are introduced, generalising the ensemble approach so that it can be used in conjunction with many subset evaluation techniques, and search algorithms. A recently developed heuristic algorithm: harmony search is employed to demonstrate the approaches. Results of experimental comparative studies are reported in order to highlight the benefits of the present work. The paper ends with a proposal to extend the application of feature selection ensemble to aiding the development of biped robots (inspired by the authors’ involvement in the joint celebration of Olympic and the centenary of the birth of Alan Turing).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

متن کامل

سودمندی رگرسیون‌های تجمیعی و روش‌های انتخاب متغیرهای پیش‌بین بهینه در پیش‌بینی بازده سهام

مقاله حاضر به بررسی سودمندی رگرسیون‌های تجمیعی و روش‌های انتخاب متغیرهای پیش‌بین بهینه (شامل روش مبتنی بر همبستگی و ریلیف) برای پیش‌بینی بازده سهام شرکت‌های پذیرفته شده در بورس اوراق بهادار تهران می‌پردازد. به‌منظور ارزیابی عملکرد رگرسیون تجمیعی، معیارهای ارزیابی (شامل میانگین قدرمطلق درصد خطا، مجذور مربع میانگین خطا و ضریب تعیین) مربوط به پیش‌بینی این روش، با رگرسیون خطی و شبکه‌های عصبی مصنوعی...

متن کامل

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...

متن کامل

The prediction of lymphedema via the combination of the selected data mining algorithms

Background: Breast cancer is the second leading cause of cancer death in women, after lung cancer. Due to the importance of predicting this disease, the use of data mining methods in medical research is more significant than before. Data mining algorithms can be a great help in preventing the development of lymphedema in patients. The aim Of this study was to create a diagnosis system that can ...

متن کامل

Enhanced Classification Accuracy for Cardiotocogram Data with Ensemble Feature Selection and Classifier Ensemble

In this paper ensemble learning based feature selection and classifier ensemble model is proposed to improve classification accuracy. The hypothesis is that good feature sets contain features that are highly correlated with the class from ensemble feature selection to SVM ensembles which can be achieved on the performance of classification accuracy. The proposed approach consists of two phases:...

متن کامل

Combining Feature Selection and Ensemble Learning for Software Quality Estimation

High dimensionality is a major problem that affects the quality of training datasets and therefore classification models. Feature selection is frequently used to deal with this problem. The goal of feature selection is to choose the most relevant and important attributes from the raw dataset. Another major challenge to building effective classification models from binary datasets is class imbal...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012